Abstract

The construction of the simultaneous confidence bands for the integrated haz- 
ard function is considered. The Nelson-Aalen estimator is used. The simultaneous 
confidence bands based on bootstrap methods are presented. Two methods of con- 
struction of such confidence bands are proposed. The weird bootstrap method is 
used for resampling. Simulations are made to compare the actual coverage prob- 
ability of the bootstrap and the asymptotic simultaneous confidence bands. It is 
shown that the equal-tailed bootstrap confidence band has the coverage probability
closest to the nominal one. We also present an application of our confidence bands
to data on survival after heart transplant.
1 Introduction and summary



In biomedical settings, the multiplicative intensity model introduced by Aalen has
many applications. It is a model for point processes observed on a fixed time interval
in which the stochastic intensity is decomposed into a deterministic function $\alpha(t)$
and a stochastic process $Y(t)$. The function $\alpha(t)$ may be interpreted as the individual
force of transition at time $t$, and $Y(t)$ as the number at risk just before time $t$.

In broad terms, what makes survival data special is the presence of censoring.
To analyze such data with the multiplicative intensity model, a general assumption of
independent censoring is required: at any time $t$, the future survival experience
is not statistically altered by censoring or by the survival experience in the past.
The censoring mechanism is modelled by the process $Y$ and has no influence
on the function $\alpha$.

In survival analysis, the main quantities of interest are the survivor function and
the integrated hazard function. In this paper we consider the latter, which is estimated
by the Nelson-Aalen estimator. This estimator is difficult to interpret without
confidence intervals. Pointwise intervals are not fully satisfactory when one wants
a confidence region for the whole curve, valid simultaneously for all points.

The construction of simultaneous confidence bands is difficult because it requires the
uniform consistency property. However, such confidence bands are badly needed in
practical applications, for example in work related to ours, such as the papers
of Cowling, Hall and Phillips ([6]) and Snethlage ([H|), but also in time series analysis
(Leskow and Wronka) and nonparametric regression (Loader [13]).

The formula for the asymptotic confidence interval for the Nelson-Aalen estimator
is known; however, it is very complicated and does not work well for small samples
(see PP). An alternative approach is to use bootstrap methods. This idea
was first introduced by Efron ([Hj) and later developed in many papers (including those
cited above). Bootstrapping of point processes is not yet fully explored. Some results
are presented in [1] and [3]. The Poisson process context is treated in [6];
however, these methods cannot be easily adapted to the multiplicative intensity model.

The aim of our work is to construct bootstrap simultaneous confidence bands
for the Nelson-Aalen estimator and to compare them with the asymptotic ones.
We use simulations to check whether the actual coverage probability is close to
the nominal one. In our calculations we use the weird bootstrap method.

We show that for small samples the bootstrap bands have much better coverage
probabilities. Not only are the actual coverage probabilities of the bootstrap simultaneous
confidence bands very close to the nominal ones, but the left- and right-tail error
probabilities are also almost equal.

Our paper is organized as follows. Section 2 contains a short survey of
basic results related to the Nelson-Aalen estimator and the bootstrap for point
processes. Section 3 is dedicated to the construction of simultaneous confidence bands
for the estimator considered. A practical example related to a heart transplant study
is included in Section 4, while Section 5 contains additional numerical results.
Conclusions and open questions are presented in Section 6.

2 Problem formulation 

In our paper we construct bootstrap simultaneous confidence bands for the
integrated intensity function. We use the weird bootstrap introduced in pQ. We compare
our results with those presented in l._ and . The bootstrap is well motivated
in the small-sample case and when the censoring mechanism is quite complex. Moreover,
the standard asymptotic theory provides confidence intervals that are quite difficult to
apply. To construct bootstrap simultaneous confidence bands we apply one of the
methods proposed in [6].

While defining our problem we follow (page 176). We consider a continuous-time
interval $\mathcal{T}$ which may be of the form $[0, \tau]$ or $[0, \tau)$ for a given terminal
time $\tau$, $0 < \tau < \infty$. Let $(\Omega, \mathcal{F})$ be a measurable space equipped with a
filtration $(\mathcal{F}_t, t \in \mathcal{T})$. We define on $(\Omega, \mathcal{F})$ a counting process
$N = (N(t), t \in \mathcal{T})$ adapted to the filtration such that its stochastic intensity
function $\lambda$ is of the form $\lambda(t) = \alpha(t)Y(t)$, where $\alpha$ is a nonnegative
deterministic function and $Y$ is a predictable process. For example, we can consider an
initial group of $Y_0$ patients with cancer after some medical treatment. Although the
patients enter the study at different calendar times, we observe only their time since
operation. In this case $\alpha(t)$ is the individual intensity of death and $Y(t)$ is the number
at risk at time $t$, i.e. the number of patients still alive at time $t$. For a
practical example see Section 4.

The only assumption we have to make about $\alpha$ is its integrability,

$$\int_0^t \alpha(s)\,ds < \infty \quad \text{for all } t \in \mathcal{T}.$$

We consider the Nelson-Aalen estimator $\hat{A}$ for

$$A(t) = \int_0^t \alpha(s)\,ds,$$

which is of the form

$$\hat{A}(t) = \sum_{j:\,T_j \le t} \frac{1}{Y(T_j)},$$

where $T_j$ are the jump times.

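We define an estimator of the mean squared error function as

$$\hat{\sigma}^2(t) = \sum_{j:\,T_j \le t} \frac{\bigl(Y(T_j) - \Delta N(T_j)\bigr)\,\Delta N(T_j)}{Y(T_j)^3},$$

where $\Delta N(T_j) = N(T_j) - N(T_{j-1})$.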
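As a minimal sketch of the two estimators above, the following Python function computes the Nelson-Aalen estimate and the variance estimate at the jump times from right-censored data. The function name `nelson_aalen` and its arguments are illustrative assumptions and do not come from the paper. Tied event times are handled through the jump size $\Delta N(T_j)$, which equals 1 when there are no ties, so the first sum then reduces to the sum of $1/Y(T_j)$.

```python
import numpy as np

def nelson_aalen(times, events):
    """Nelson-Aalen estimate and variance estimate at the jump times.

    times  : observed time (event or censoring) of each individual
    events : 1 if the observed time is an event, 0 if it is censored
    (hypothetical interface, chosen only for this illustration)
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    # distinct event times T_1 < T_2 < ...
    jump_times = np.unique(times[events == 1])
    # Y(T_j): number of individuals still under observation just before T_j
    at_risk = np.array([(times >= t).sum() for t in jump_times])
    # Delta N(T_j): number of events at T_j (1 when there are no ties)
    d_n = np.array([((times == t) & (events == 1)).sum() for t in jump_times])
    a_hat = np.cumsum(d_n / at_risk)
    var_hat = np.cumsum((at_risk - d_n) * d_n / at_risk**3)
    return jump_times, at_risk, d_n, a_hat, var_hat
```

For example, `nelson_aalen([2, 3, 3, 5, 7], [1, 1, 0, 1, 0])` has jump times 2, 3 and 5 with 5, 4 and 2 individuals at risk, so the estimate takes the values 1/5, 1/5 + 1/4 and 1/5 + 1/4 + 1/2.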



Under suitable assumptions the Nelson-Aalen estimator is uniformly consistent
on compact intervals (see page 190), which means that

$$\sup_{s \in [0,t]} \bigl|\hat{A}^{(n)}(s) - A(s)\bigr| \xrightarrow{P} 0 \quad \text{as } n \to \infty, \text{ for } t \in \mathcal{T}.$$

The asymptotic distribution of the Nelson-Aalen estimator can be obtained from
Rebolledo's martingale central limit theorem (for details see page 190). It should be
pointed out that constructing simultaneous confidence bands requires
a version of the functional central limit theorem for the cumulative intensity function.
Such results can be found in (page 263); however, the limiting distribution is quite
difficult to apply in practice. Moreover, it is still unknown what form of the functional
central limit theorem can be established for $\alpha$ alone (see also Section 6 for additional
remarks regarding this problem).

The results above can be used to construct pointwise confidence intervals and
simultaneous confidence bands for $A(t)$ (CP). Unfortunately, the formulae for the asymptotic
distributions are very complicated. That is why we want to apply bootstrap methods
to construct simultaneous confidence bands. Bootstrapping of counting processes is not
easy because such processes are not based on i.i.d. samples. The problem is complex
and methods for the general case are not known. There are some results
for Poisson processes (see [6]); however, in this case one may get similar results
without simulations (see [14]). Some methods of bootstrapping point processes are also
presented in [1] and [3].

In our paper we apply the weird bootstrap method. The idea is based on the fact that
the asymptotic distribution of $a_n(\hat{A} - A)$ has independent increments and
$\mathrm{Var}(d\hat{A}(t) \mid \mathcal{F}_{t-}) = dA(t)(1 - dA(t))/Y(t)$. The following definition is quoted from [1].

Definition 2.1 The Weird Bootstrap 

Given $N$, $Y$, and $\hat{A}$, let $N^*$ be a process with independent binomial $(Y(t), \Delta\hat{A}(t))$
distributed increments at the jump times of $N$, constant between jump times. Let
$\hat{A}^* = \int dN^*/Y$. Estimate the distribution of $\hat{A} - A$ by the conditional distribution,
given $N$ and $Y$, of $\hat{A}^* - \hat{A}$.

For the proof of consistency of this method see (page 220). 
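
The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `weird_bootstrap`, its arguments and the seed handling are hypothetical, and the variance estimate of each resample is obtained by applying the same variance formula to $N^*$, which is a choice made here rather than something the definition prescribes.

```python
import numpy as np

def weird_bootstrap(at_risk, d_n, n_boot=200, seed=None):
    """Draw weird-bootstrap replicates of the Nelson-Aalen estimator.

    at_risk : Y(T_j) at the jump times of N
    d_n     : Delta N(T_j) at the jump times of N
    Returns arrays of shape (n_boot, number of jump times).
    """
    rng = np.random.default_rng(seed)
    at_risk = np.asarray(at_risk)
    d_n = np.asarray(d_n)
    # increments of A-hat, used as binomial success probabilities
    p = d_n / at_risk
    # N* has independent Binomial(Y(T_j), Delta A-hat(T_j)) increments;
    # Y is kept fixed, so a resampled failure does not shrink later risk sets
    d_n_star = rng.binomial(at_risk, p, size=(n_boot, len(at_risk)))
    a_star = np.cumsum(d_n_star / at_risk, axis=1)
    # assumption: same variance formula, evaluated on the resampled jumps
    var_star = np.cumsum((at_risk - d_n_star) * d_n_star / at_risk**3, axis=1)
    return a_star, var_star
```

Each row of `a_star` is one replicate $\hat{A}^*$ evaluated at the jump times of $N$; the conditional distribution of $\hat{A}^* - \hat{A}$ is then approximated by the empirical distribution over the rows.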

The word weird is not accidental. At every time point $t \in \mathcal{T}$, each of the $Y(t)$ individuals
at risk has the same probability of failure. However, an event at time
$t$ does not exert any influence on any other time $s \in \mathcal{T}$.

The problem of bootstrapping point processes is not completely solved and is quite
challenging. Some partial solutions are discussed in [5], [7] and [1], among others. In the
next section we use the weird bootstrap to construct simultaneous confidence bands.

3 Simultaneous confidence bands 

The Nelson-Aalen point estimator is difficult to interpret without some idea of its
accuracy. Resolving this problem requires constructing confidence intervals or confidence
bands. These bands are also interesting because of their hypothesis testing
interpretation. We can think of a confidence band as a one-sample test with the
null hypothesis $A = A_0$, which is rejected at significance level $\theta$ if $A_0$ is not completely
contained in the band. For this purpose pointwise confidence intervals are not satisfactory.
That is why we introduce simultaneous confidence bands.

Definition 3.1 Confidence region 

Let $B$ denote a connected, nonempty, random subset of the rectangle $[0, \tau] \times [0, \infty)$
such that $B \cap \{(x, y) : 0 \le y < \infty\}$ is nonempty for each $x \in [0, \tau]$. We call $B$ a
confidence region for $A$ over the set $S \subseteq [0, \tau]$ with coverage probability $1 - \theta$ if
$P\{(x, A(x)) \in B \text{ for all } x \in S\} = 1 - \theta$.

In our paper S is always an interval. 

Simultaneous confidence bands may be constructed in many different ways. The
authors of the book (page 209) proposed two types of such bands: the EP-band (equal
precision band) and the HW-band (Hall-Wellner band). These confidence bands are based
on the asymptotic distribution of the Nelson-Aalen estimator on compact intervals,
which can be derived from the martingale central limit theorem.
Both the EP- and the HW-band for $A$ on $[t_1, t_2]$ are of the form

$$\hat{A}(s) \pm a_n^{-1} K_{q,\theta}(c_1, c_2)\,\bigl(1 + a_n^2 \hat{\sigma}^2(s)\bigr) \Big/ q\!\left(\frac{a_n^2 \hat{\sigma}^2(s)}{1 + a_n^2 \hat{\sigma}^2(s)}\right)$$

with $K_{q,\theta}(c_1, c_2)$ being the upper $\theta$ percentile of the distribution of

$$\sup_{x \in [c_1, c_2]} |q(x) W^0(x)|,$$

where $W^0$ denotes the standard Brownian bridge.
The constants $c_1$ and $c_2$ can be approximated by

$$\hat{c}_i = \frac{a_n^2 \hat{\sigma}^2(t_i)}{1 + a_n^2 \hat{\sigma}^2(t_i)}, \quad i = 1, 2,$$

where $a_n = \sqrt{n}$ is a normalizing factor and $n$ is the number of individuals in the study.
For the EP-band $q$ is chosen as $q_1(x) = \{x(1 - x)\}^{-1/2}$, which yields confidence bands
proportional to the pointwise ones. For the HW-band $q$ is chosen as $q_2(x) = 1$.
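
A short check shows why the choice $q_1$ gives a band proportional to the pointwise intervals while $q_2$ does not. It assumes that the band has the general form stated above, with $q$ evaluated at $x = a_n^2\hat\sigma^2(s)/(1 + a_n^2\hat\sigma^2(s))$; the labels "EP half-width" and "HW half-width" are introduced here only for this illustration.

```latex
\begin{align*}
x &= \frac{a_n^2\hat\sigma^2(s)}{1 + a_n^2\hat\sigma^2(s)}, \qquad
1 - x = \frac{1}{1 + a_n^2\hat\sigma^2(s)}, \\
\frac{1}{q_1(x)} &= \sqrt{x(1-x)} = \frac{a_n\hat\sigma(s)}{1 + a_n^2\hat\sigma^2(s)}, \\
\text{EP half-width} &= a_n^{-1} K_{q_1,\theta}(c_1,c_2)\,\bigl(1 + a_n^2\hat\sigma^2(s)\bigr)\,
  \frac{a_n\hat\sigma(s)}{1 + a_n^2\hat\sigma^2(s)}
  = K_{q_1,\theta}(c_1,c_2)\,\hat\sigma(s), \\
\text{HW half-width} &= a_n^{-1} K_{q_2,\theta}(c_1,c_2)\,\bigl(1 + a_n^2\hat\sigma^2(s)\bigr).
\end{align*}
```

Under this reading the EP-band has the same shape as a pointwise interval $\hat{A}(s) \pm z\,\hat\sigma(s)$, with the normal quantile $z$ replaced by the constant $K_{q_1,\theta}(c_1,c_2)$.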
In both cases the $\theta$ percentiles of the asymptotic distribution are difficult to obtain. These
bands also perform badly even with sample sizes of 100-200 [2]. For this
reason one may consider transformations to improve the approximation to the
asymptotic distribution (page 211).



To avoid such problems we consider bootstrap simultaneous confidence bands. The
authors of the paper [6] proposed a few different methods of constructing these bands.
In our calculations we use the weird bootstrap method. Our construction of bootstrap-t
confidence regions for $A$ is based on the bootstrap approximation

$$T^*(x) = \frac{\hat{A}^*(x) - \hat{A}(x)}{\hat{\sigma}^*(x)}$$

of

$$T(x) = \frac{\hat{A}(x) - A(x)}{\hat{\sigma}(x)}.$$

For details see pQ.

Below we present two bootstrap confidence bands: 

1. The confidence region is defined by

$$B_1 = \{(x, y) : x \in S,\ \max[0, \hat{A}(x) - t_1 \hat{\sigma}(x)] \le y \le \hat{A}(x) + t_1 \hat{\sigma}(x)\},$$

where $t_1$ is chosen such that

$$P\{|T^*(x)| \le t_1, \text{ all } x \in S \mid N, Y\} = 1 - \theta.$$

The main feature of this region is that at the point $x$ its width is proportional to
$\hat{\sigma}(x)$.

2. In many applications populations cannot be modelled by symmetric distributions,
and the only reasonable choice is a strongly skewed distribution. In all of the
intervals presented above, skewness was not taken into consideration, which has
a negative impact on the coverage probability. To adjust for skewness of
the distribution one can construct a region for which the left- and right-tail error
probabilities are equal. A region of this kind is of the form

$$B_2 = \{(x, y) : x \in S,\ \max[0, \hat{A}(x) - t_3 \hat{\sigma}(x)] \le y \le \hat{A}(x) - t_2 \hat{\sigma}(x)\},$$

where $t_2$ and $t_3$ are chosen such that

$$P\{t_2 \le T^*(x) \le t_3, \text{ all } x \in S \mid N, Y\} = 1 - \theta$$

and

$$P\{T^*(x) < t_2, \text{ some } x \in S \mid N, Y\} = P\{T^*(x) > t_3, \text{ some } x \in S \mid N, Y\}.$$
In the next section we present an example of applying such bands. 
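
The two regions described in this section can be sketched numerically as follows. This is a hedged illustration, not the authors' implementation: `t_star` is assumed to be an array of bootstrap values $T^*(x)$ with one row per resample and one column per grid point of $S$, the equal-tail condition is read as "the path leaves the band below (respectively above) somewhere on $S$", and the grid search over a common tail level is just one simple way to find $t_2$ and $t_3$. All function names are hypothetical.

```python
import numpy as np

def symmetric_cutoff(t_star, theta=0.05):
    """t_1: (1 - theta) quantile of max over S of |T*(x)|."""
    return np.quantile(np.abs(t_star).max(axis=1), 1 - theta)

def equal_tailed_cutoffs(t_star, theta=0.05, n_grid=101):
    """(t_2, t_3) with equal tail levels and bootstrap coverage >= 1 - theta."""
    mins = t_star.min(axis=1)   # lowest point of each bootstrap path on S
    maxs = t_star.max(axis=1)   # highest point of each bootstrap path on S
    best = (mins.min(), maxs.max())   # tail level 0 covers every resample
    for beta in np.linspace(0.0, theta, n_grid):
        t2 = np.quantile(mins, beta)       # P*(min T* < t2) is about beta
        t3 = np.quantile(maxs, 1 - beta)   # P*(max T* > t3) is about beta
        if np.mean((mins >= t2) & (maxs <= t3)) < 1 - theta:
            break   # joint coverage only decreases as beta grows
        best = (t2, t3)
    return best

def bootstrap_bands(a_hat, sigma_hat, t_star, theta=0.05):
    """Lower and upper limits of B_1 and B_2 on the grid."""
    a_hat = np.asarray(a_hat, dtype=float)
    sigma_hat = np.asarray(sigma_hat, dtype=float)
    t1 = symmetric_cutoff(t_star, theta)
    t2, t3 = equal_tailed_cutoffs(t_star, theta)
    b1 = (np.maximum(0.0, a_hat - t1 * sigma_hat), a_hat + t1 * sigma_hat)
    b2 = (np.maximum(0.0, a_hat - t3 * sigma_hat), a_hat - t2 * sigma_hat)
    return b1, b2
```

Because $t_2$ is typically negative, the upper limit $\hat{A}(x) - t_2\hat{\sigma}(x)$ of the second region lies above the estimator, and a skewed bootstrap distribution translates directly into an asymmetric band.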

4 Practical example 

We consider a group of 64 patients after heart transplant. The
data are taken from Appendix A, pages 387-389. In our approach, the risk
is defined as the rejection of the transplant, so the time between the operation and the
rejection is considered. 35 observations are censored; censoring occurred if a
patient was alive at the end of the study or lost to follow-up. The 95% confidence
bands, simultaneous with respect to the time argument, were constructed on the time
interval between day 20 and day 1200 of observation. The construction of
the bands was based on the Nelson-Aalen estimator. Figure 1 presents the
Nelson-Aalen estimator together with the HW and EP bands, and Figure 2 presents it with
the B1 and B2 bootstrap simultaneous confidence bands. Note that the B1, EP and HW bands
are symmetric; only B2 is not. The upper limits of B1 and B2 coincide.
The lower limit of B1 is noticeably too low, which suggests that B1 is too
wide. The HW and EP bands are close to each other, but EP is noticeably broader over
most of the time interval. Moreover, B2 is shifted upwards compared with the
asymptotic simultaneous confidence bands.

Now we will verify the actual coverage probability for the considered bands. 

5 Numerical results 

Our aim is to compare the coverage probability of the asymptotic and bootstrap
simultaneous confidence bands. Our simulations are based on the multiplicative model for the
intensity function, $\lambda(t) = Y(t)\alpha(t)$. We concentrate on a few typical examples of the
function $\alpha$. To generate the process $Y$ we first choose the initial value $Y_0$ (the number
of individuals at risk), and then for every individual the time of termination is sampled
from the exponential distribution with mean 0.25. Given such $Y$, we generate
the underlying point process.

Figure 1: HW- and EP-band

Figure 2: B1- and B2-band



For our study we chose four functions, $\alpha_1(t)$, $\alpha_2(t)$, $\alpha_3(t)$ and $\alpha_4(t)$.

Curves of this kind can be applied in biomedicine, insurance and demography. For
example, U-shaped functions may reflect the behavior of the intensity of death, and
inverted U-shaped functions may describe the intensity of birth. These shapes
are reflected in the equations of the functions $\alpha_2$ and $\alpha_4$. Figure 3 shows these intensity
functions and Figure 4 presents their integrated versions.

We run simulations for the interval $S = [0.2, 0.8]$, with $B = 200$ bootstrap resamples
and initial numbers at risk $Y_0 = 25, 50, 75$. Table 1 shows the actual
coverage probability when the nominal coverage probability is 0.95 and the number
of iterations is 10000. For every $Y_0$, function $\alpha_i$ ($i = 1, \ldots, 4$) and method of
construction of the confidence region, the first and second numbers in Table 1 are the
left- and right-tail error probabilities and the third is the actual coverage probability
(all probabilities are given in percent).
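
The three numbers reported for each configuration can be tallied as in the sketch below. It assumes, as an interpretation of the text rather than a statement of the authors' code, that a left-tail error means the true integrated function falls below the lower limit somewhere on $S$ and a right-tail error means it rises above the upper limit somewhere on $S$. The function names are hypothetical.

```python
import numpy as np

def classify_band(lower, upper, a_true):
    """Outcome of one simulated band, evaluated on a grid of S."""
    lower, upper, a_true = map(np.asarray, (lower, upper, a_true))
    left = bool(np.any(a_true < lower))    # true curve falls below the band
    right = bool(np.any(a_true > upper))   # true curve rises above the band
    return left, right, not (left or right)

def summarize(outcomes):
    """Percentages of left-tail errors, right-tail errors and coverage."""
    return 100.0 * np.asarray(outcomes, dtype=float).mean(axis=0)
```

Calling `classify_band` once per simulated data set and passing the collected outcomes to `summarize` yields a triple in the same layout as one cell of the table.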

As we expected, the HW- and EP-bands perform quite badly for small samples.
For $Y_0 = 25$ in particular, the actual coverage probability is roughly 5 to 12 percentage
points below the nominal level. This happens because these are asymptotic bands, and in our
case the number of jumps of the point processes is not large enough for the asymptotic
distribution to apply. For $Y_0 = 50$ the actual coverage probability of these bands is better
but still remains below the nominal one, by roughly 1 to 6 percentage points. For $Y_0 = 75$
all results are satisfactory. The first of the bootstrap confidence bands which we proposed
(B1) performs well for small $Y_0$, but when the number of jumps rises it remains consistently
too wide. The equal-tailed bootstrap confidence band (B2) behaves well in all considered
situations. Its actual coverage probability is always close to nominal, even for a small
initial number at risk (when the asymptotic bands fail). Our simulations also show that the
left-side failure probability of the EP- and HW-bands is much too small: its
value is at most 1%. This means that the integrated functions corresponding to $\alpha_i(t)$
almost never cross the lower limit of the confidence region, i.e. the lower limit lies too far
from the estimator. The advantage of the B2 region is its equal-tailed feature: the
non-coverage probabilities on the left-hand and right-hand sides are almost equal.
We checked empirically that B2 is the best choice among the bands considered. Independently
of the initial number at risk, it has a coverage probability close to the nominal one and,
what is very important, it ensures an almost equally divided failure probability.

Figure 3: Intensity functions

Figure 4: Integrated intensity functions

                    $Y_0 = 25$          $Y_0 = 50$          $Y_0 = 75$
Function    Method  left right  cov.    left right  cov.    left right  cov.
$\alpha_1$  HW       0.4  11.5  88.1     0.6   5.7  93.7     0.8   4.1  95.1
            EP       0.4  12.7  87.0     0.6   6.6  92.8     0.8   4.7  94.5
            B1       0.0   4.0  96.0     0.1   2.5  97.4     0.3   2.2  97.5
            B2       2.4   4.0  93.6     3.2   2.6  94.3     3.1   2.2  94.7
$\alpha_2$  HW       0.5  10.1  89.5     0.9   6.5  92.7     1.0   4.7  94.3
            EP       0.5  10.0  89.5     0.8   6.6  92.6     1.0   5.0  94.1
            B1       0.1   3.6  96.3     0.2   3.4  96.4     0.3   2.6  97.1
            B2       2.6   3.6  93.8     3.0   3.6  93.4     3.1   2.9  94.1
$\alpha_3$  HW       0.2  13.2  86.6     0.7   9.8  89.5     0.8   4.7  94.5
            EP       0.2  14.8  85.0     0.5   9.4  90.1     0.7   5.6  93.7
            B1       0.0   4.7  95.3     0.9   2.9  96.3     0.1   1.6  98.3
            B2       2.3   4.7  93.0     2.9   2.2  95.0     2.9   1.7  95.5
$\alpha_4$  HW       0.7  13.8  85.8     0.5   6.7  92.8     0.9   4.4  94.8
            EP       1.0  16.6  83.1     0.3   8.5  91.2     0.8   5.3  93.9
            B1       1.0   5.2  94.7     0.1   2.3  97.6     0.1   1.5  98.4
            B2       2.6   5.1  92.5     2.7   2.3  95.0     3.3   1.5  95.2

Table 1: Actual coverage probability

Now we compare our results with those presented in [5]. The authors of [5] proposed
arcsine and logarithmic transformations of the Nelson-Aalen estimator. They considered
modifications of the EP- and HW-bands which use these transformations. The asymptotic
simultaneous confidence bands constructed in this way perform satisfactorily for sample
sizes as low as 25.

Using the simulation methods presented above, we compare the behaviour of these bands with
the bootstrap band B2. The results are presented in Table 2. AHW and AEP denote
the arcsine transforms of the HW- and EP-band respectively. The logarithmic-transform
bands are denoted by LHW and LEP.

As might be expected, for sample sizes 50 and 75 all methods give satisfactory
results. For sample size 25 the bootstrap simultaneous confidence band B2 has better
coverage properties than the transformed asymptotic ones. The actual coverage probability
of B2 is between 92.5% and 93.8% for all functions $\alpha_i$. It is about 2 percentage points
closer to the nominal level than the actual coverage probability of the transformed bands.
At first sight LEP seems to be a good choice, but as the sample size grows it gets too wide.
Thus, although the considered transformations improve the actual coverage probability and the
left- and right-tail error probabilities of the asymptotic bands, B2 is still the best choice.

                    $Y_0 = 25$          $Y_0 = 50$          $Y_0 = 75$
Function    Method  left right  cov.    left right  cov.    left right  cov.
$\alpha_1$  AHW      2.2   6.6  91.2     2.7   3.1  94.2     2.7   2.0  95.3
            AEP      2.3   5.9  91.8     2.6   3.3  94.1     2.7   2.0  95.3
            LHW      3.1   5.1  91.8     3.5   2.4  94.1     3.4   1.4  95.2
            LEP      3.2   3.8  93.0     3.4   2.0  94.6     3.3   1.3  95.4
            B2       2.4   4.0  93.6     3.2   2.6  94.3     3.1   2.2  94.7
$\alpha_2$  AHW      2.3   4.8  92.9     2.4   3.2  94.4     2.4   2.2  95.4
            AEP      2.3   4.5  93.2     2.3   3.3  94.4     2.4   2.3  95.3
            LHW      3.0   3.2  93.8     3.0   2.4  94.6     2.9   1.6  95.5
            LEP      3.1   2.4  94.5     2.9   2.0  95.1     3.0   1.5  95.5
            B2       2.6   3.6  93.8     3.0   3.6  93.4     3.1   2.9  94.1
$\alpha_3$  AHW      2.3   8.7  89.0     2.5   3.8  93.7     2.4   2.3  95.3
            AEP      2.2   6.6  91.2     2.2   3.1  94.7     2.3   2.3  95.4
            LHW      3.8   6.2  90.0     3.5   2.8  93.7     3.6   1.8  94.6
            LEP      3.8   4.5  91.7     3.5   1.8  94.7     3.6   1.1  95.3
            B2       2.3   4.7  93.0     2.9   2.2  95.0     2.9   1.7  95.5
$\alpha_4$  AHW      2.3   8.8  88.9     2.5   3.0  94.5     2.5   2.3  95.2
            AEP      2.3   7.3  90.4     2.3   3.1  94.6     2.4   2.1  95.5
            LHW      4.0   7.4  88.6     3.9   2.5  93.4     3.5   1.6  94.9
            LEP      3.6   5.7  90.7     3.6   1.7  94.7     3.3   1.1  95.6
            B2       2.6   5.1  92.5     2.7   2.3  95.0     3.3   1.5  95.2

Table 2: Actual coverage probability

6 Conclusions 

In many applications, the hazard function is more interesting and relevant than the
integrated hazard function, but it is also more challenging to estimate.
There are several approaches to that problem, the histogram-based sieve estimator
considered in Leskow, Rozanski and Leskow ^U] being one of them. Unfortunately, a
functional central limit theorem for such an estimator is still an open question.
Without such a result, the construction of simultaneous confidence bands is impossible.

In our paper we showed that for small samples the bootstrap simultaneous confidence
bands behave better than the asymptotic ones: their actual coverage probability
is closer to the nominal level. An advantage of the equal-tailed confidence region is the
balance between the left- and right-tail error probabilities. A disadvantage of all
simultaneous regions considered in this paper is that they do not take into account the
shape of the estimated function. The integrated hazard function is always nondecreasing;
unfortunately, the lower confidence limit sometimes decreases. It may be interesting to
construct regions that take into consideration known features of the estimated function
(for example monotonicity or unimodality).

Another interesting problem is the bootstrapping of point processes. We consider only
one method (the weird bootstrap). In the paper [6] other methods are proposed, but
only for Poisson processes. A method for obtaining bootstrap replicates of a
one-dimensional point process is presented in [4], and its multi-dimensional version is also
proposed. Because of deficient coverage properties in some cases, Braun and Kulperger
proposed in [5] a technique for one-dimensional point processes which uses the idea of
re-colouring presented in [3]. It remains an open question whether these methods can be
applied in the general case.